Abstract

The calculation of detailed shadows remains one of the most diffi- 
cult challenges in computer graphics, especially in the case of ex- 
tended (linear or area) light sources. This paper introduces a new 
tool for the calculation of shadows cast by extended light sources. 
Exact shadows are computed in some constrained configurations 
by using a convolution technique, yielding a fast and accurate so- 
lution. Approximate shadows can be computed for general con- 
figurations by applying the convolution to a representative "ideal" 
configuration. We analyze the various sources of approximation 
in the process and derive a hierarchical, error-driven algorithm for 
fast shadow calculation in arbitrary configurations using a hierar- 
chy of object clusters. The convolution is performed on images 
rendered in an offscreen buffer and produces a shadow map used as 
a texture to modulate the unoccluded illumination. Light sources 
can have any 3D shape as well as arbitrary emission characteristics, 
while shadow maps can be applied to groups of objects at once. 
The method can be employed in a hierarchical radiosity system, 
or directly as a shadowing technique. We demonstrate results for 
various scenes, showing that soft shadows can be generated at in- 
teractive rates for dynamic environments. 

Keywords: Soft shadows, Convolution, Shadow map, Error-Driven 
illumination, Texture. 

1 Introduction 

The computation of soft shadows, i.e. shadows cast by extended 
light sources, is one of the most difficult challenges in rendering 
for computer graphics. Soft shadows are a result of the continuous 
variation of illumination across a receiving surface, when the light 
source becomes partially occluded by other objects in the scene. 
Their appearance is mainly controlled by the shape and location of 
penumbra regions, which are the regions on a receiver where the 
light source is partially visible. 

Soft shadows play a key role in the overall realism of computer- 
generated images, because they provide important visual cues about 
the 3D arrangement of objects [28]. The location of cast shadows 
with respect to the blocking objects informs the viewer about the 
main directions of illumination, and the sharpness of the penumbra
helps the viewer understand the distance relationships between the
source, blocker and receiver.

Unfortunately, the calculation of soft shadows is also very difficult.
It can be restated as an area visibility determination problem,
since the goal is to identify the regions of partial source visibility
and to quantify the relative area of the source that is visible.
There are many methods for computing hard shadows (from point 
light sources), including some texture-based algorithms that can run 
in real-time on graphics computers. However, the two main avenues 
for the treatment of extended light sources each have severely lim- 
iting problems. Analytic techniques such as discontinuity mesh- 
ing suffer from excessive time and memory costs, and numerical 
robustness problems, while sampling techniques are prone to an- 
noying image artifacts unless they are pushed to a stage where they 
become too expensive. 

In this paper, we present a new method for the calculation of soft 
shadows, which is able to provide pleasant, artifact-free images in 
a very efficient way. The method is based on the calculation of 
shadow maps, which are textures created from images of the light 
sources and occluders using a convolution technique. The convo- 
lution is performed with images of the light source and the set of 
occluders, rendered in offscreen buffers. The shadow textures are 
then used to modulate direct light source illumination across the re- 
ceiving objects. Exact images are obtained for some specific cases 
(parallel polygons), while for general configurations some approx- 
imation is necessary. We analyze the error incurred and the various 
sources of approximation, and show how the overall approximation 
can be controlled using a spatial hierarchy of object clusters. This 
is achieved by combining shadow maps of the sub-clusters hierar- 
chically. 

The resulting error-driven algorithm automatically computes soft 
shadows at interactive rates for extended light sources of arbitrary 
shape and exitance distribution, while avoiding excessive approxi- 
mation under a feature-based error metric. The method can be used 
in any rendering technique, with the only requirement of a hierar- 
chy of spatial clusters in order to use the hierarchical combination. 
The algorithm is naturally adaptive and eliminates the difficulties 
associated with light source sampling. The error-driven hierarchi- 
cal combination of shadow maps lets us adapt the effort to user- 
specified approximation tolerances. 

Because it uses a single rendered image of the blocker to gen- 
erate the soft shadows, our technique effectively trades graphics 
performance for raw computing performance in the form of FFT 
calculations and image manipulation, which makes it interesting 
for computers with low or mid-range graphics capabilities. 

The remainder of this paper is organized as follows: Section 2 re- 
views previous approaches to shadow generation and discusses our 
goals; Section 3 explains the basic convolution method for comput- 
ing shadow maps, and Section 4 extends the technique to general 
source and receiver configurations. A number of implementation 
choices and details are presented in Section 5. Section 6 then dis- 
cusses the different sources of error and presents our error-driven 
hierarchical combination method for shadow maps. We present re- 
sults obtained in a variety of configurations in Section 7, discuss the 
merits of the approach in Section 8 and conclude in Section 9. 



2 Previous work 



Shadows and illumination techniques 



Woo et al. present an excellent survey of the vast literature on 
shadow algorithms [31], and we will only briefly review here some 
of the main approaches, especially the few that allow the computa- 
tion of soft shadows. 

Sampling methods 

Ray tracing algorithms compute shadows by casting a ray between 
a point lying on a surface, and a designated light source [29]. This 
blends very nicely with the rest of the ray tracing technique, but is 
quite expensive since each ray must in effect sample the scene for 
potential occluders. Soft shadows are generated in distributed ray 
tracing by casting several rays towards a set of sample points on an 
extended light source [5]. 

Another sampling option is to create a depth image from the 
point of view of the light source [30]. This shadow buffer can be 
used to check whether a given point, visible in the final image, is 
visible from the source. The severe aliasing issues experienced in a 
naive approach can be treated using elaborate depth filtering [21]. 
This approach uses a single point on the light source and therefore
cannot render penumbrae due to extended light sources.

Using auxiliary data structures 

To avoid the cost of brute force sampling, a specific data struc- 
ture can be created that represents the visibility relationships in the 
scene. Such structures vary widely in complexity and cost, and 
essentially allow a time gain because they let us benefit from the 
coherence of visibility in space. 

Shadow volumes are constructed relatively easily from a point 
light source and polygonal occluders [6], and visible points can 
be quickly tested for inclusion in object space when rendering an 
image. Complex volumes can be represented and used efficiently 
through the use of BSP trees [3]. Approximate soft shadows are ob- 
tained by combining several shadow volumes, each corresponding 
to a sample point on the extended source [2]. 

A better structure for extended light sources records visibility 
information on the surfaces of the scene, in the form of a discon- 
tinuity mesh that includes all lines where the illumination function 
has discontinuities of various orders. Unless an occluder touches a 
receiver, the illumination function from an extended source is con- 
tinuous, and exhibits discontinuities only in its derivatives. Tech- 
niques for computing discontinuity meshes operate by considering 
all possible visual events and inserting critical lines in an explicit 
mesh structure [11, 15], which makes them both quite expensive 
to use and subject to robustness issues. However, they can provide 
exact visibility [8] and produce images of the highest quality. 

Interactive shadow generation 

Shadows can be generated while displaying the scene, using one 
or more extra rendering passes. This is of course especially inter- 
esting when hardware acceleration is available to perform the var- 
ious passes and combine the results. For instance, shadow maps 
from point or directional light sources can be created by the ren- 
dering pipeline and applied using texture mapping operations [23]. 
Shadow volumes can be combined using the stencil buffers [7]. 

The only method we are aware of for pre-calculating soft shad- 
ows at interactive rates is Heckbert and Herf's [13], where a num- 
ber of shadow images are created, registered and averaged on the 
receiver. Each image corresponds to a sample point on the light 
source, and they are all combined using the accumulation buffer 
[9]. This method works very well on high-end graphics machines,
but essentially produces a superposition of "hard" shadows. The 
shadows cast by the individual samples are usually noticeable un- 
less a very large number of samples is used. 



The radiosity method is often credited with a unique ability to ren- 
der subtle effects such as soft shadows and illumination details. 
Radiosity techniques are based on a surface mesh used to com- 
pute, store and reconstruct illumination functions. The exchange 
of energy between mesh elements is evaluated using form factors 
to represent the effect of orientation, geometric attenuation and vis- 
ibility. Most modern form factor calculation methods actually de- 
couple the estimation of an "unoccluded" form factor based on the 
radiosity kernel, from that of a visibility factor expressing the frac- 
tion of the area of the source element that is visible from the receiver 
[4, 27, 10]. 

Zatz [32] pushes this idea further by proposing the separate
calculation of sampled shadow masks to represent the effects of
visibility in a separate step. Several authors observed that in the case of
ideal diffuse scenes, the entire illumination can be recorded into
radiosity textures. Such textures can be precomputed off-line, which
then allows high-quality rendering with soft shadows at interactive rates
[12, 18]. Keller's "instant radiosity" technique [14] computes
radiosity textures in a manner similar to Heckbert's, by averaging
shadow images from point samples chosen on the surfaces.

In order to simulate the complex shadows due to sunlight and 
skylight under tree canopies (as shown in the "Sun and Shade" 
movie [16]), Max used the convolution of a radiance image of the 
sky with a transparency mask of the canopy [17]. 

Multiresolution shadows 

Experimental evidence suggests that while shadows are important 
in a 3D rendering, they need not necessarily be exact [28]: this is 
well known by drawing artists who often sketch an approximate 
shadow with "appropriate" characteristics to increase realism. This 
idea was applied to the calculation of multi-resolution visibility fac- 
tors in the context of hierarchical radiosity and clustering [24]. For 
a given source/receiver pair, an "appropriate" level in the hierarchi- 
cal representation of the occluders is selected, and used to create 
an approximate shadow based on an analogy with semi-transparent 
volumes. This work effectively produces shadows of variable reso- 
lution and cost, but does not provide bounds on the error incurred. 
Such bounds can be computed in a simplified 2D case [26] but
appear very difficult to compute in 3D, mainly because the
identification of a cluster with a semi-transparent object is too crude.

Discussion 

Our goal is to provide a shadowing algorithm running at interac- 
tive rates, in a manner similar to Heckbert's. However, we want to 
avoid the sampling artifacts produced when averaging hard shad- 
ows, without having to resort to very large numbers of samples 
(more than 100 can be necessary for large sources [20]). On the 
other hand, we do not want to build expensive data structures to 
represent visibility, but rather to compute necessary information on 
the fly. The convolution algorithm explained below can be seen 
as an extension of Max's method [17], and meets these goals by 
always providing a smooth image in soft shadow regions. 



3 Obtaining soft shadows with convolution 

In this section we present the basis of our technique in the form of 
an algorithm for producing a shadow map across a given receiver, 
subject to the illumination of an extended light source and to shad- 
ows cast by a set of objects. We first explain how the shadow can be 
expressed as a convolution operation, in the special case of parallel 
objects, and then propose an extension to the general case. 




Figure 1: A simple case of parallel light source (S), occluder (B) and receiver (R). The source image is convolved with the blocker image to
obtain the shadow map. Panels: (a) parallel configuration, (b) source image, (c) blocker image, (d) convolution.



3.1 Convolution formula for a set of parallel objects 

Let us first consider the special case where the light source, the 
receiver and the occluder are all planar, and lie in parallel planes 
(Figure 1). The irradiance [25] at a point y on the receiver is: 



$$H(y) = E \int_S \frac{\cos\theta \, \cos\theta'}{\pi \, d(x,y)^2} \, v(x,y) \, dx$$

where E is the exitance [25] of the source, d(x,y) the distance
between x and y, $\theta$ and $\theta'$ the incident angles of the ray $x \to y$ on
the source and the receiver, and v(x,y) a binary visibility function
indicating whether x and y are mutually visible.

A common approximation, e.g. in radiosity algorithms, consists 
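of separating the visibility factor into a distinct integral to obtain:

$$H(y) \approx E \, \underbrace{\left( \int_S \frac{\cos\theta \, \cos\theta'}{\pi \, d(x,y)^2} \, dx \right)}_{F_s(y)} \times \underbrace{\left( \frac{1}{A_s} \int_S v(x,y) \, dx \right)}_{V(y)} \qquad (1)$$

The first term $F_s(y)$ is the unoccluded point-to-polygon form factor
from y to the source, and the second term V(y) is the visible
area of the source as seen from y, expressed as a fraction of the
total source area $A_s$. This approximation implicitly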
assumes a low correlation between the variations of visibility and 
the radiosity kernel, an assumption that is reasonable in most cases. 
In this paper we are focusing on the calculation of V(y). The un- 
occluded form factor can be computed using integration formulae 
[22] or approximated using the hardware shading model [13]. 

Computing V(y) is equivalent to projecting the blocker onto the 
source from y and measuring the remaining unoccluded area of the 
source. In the present case, because all three components are paral- 
lel, the projection of the blocker simply translates on the source as 
y moves on the receiver. This is precisely why the unoccluded area 
of the source can be expressed as a convolution between the source 
and blocker images. 

More formally, let us now introduce the following characteristic 
functions of the source and blocker in their respective planes: 



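$$S(x) = \begin{cases} 1 & \text{if } x \text{ is on the source} \\ 0 & \text{elsewhere} \end{cases} \qquad P(x) = \begin{cases} 0 & \text{if } x \text{ is on the occluder} \\ 1 & \text{elsewhere} \end{cases}$$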



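We can use P to express the binary visibility value between two
points x and y, by introducing the point of intersection of the xy
line and the plane of the blocker. Denoting by $d_1$ the distance
between the source and blocker planes, and by $d_2$ the distance
between the blocker and receiver planes, this point is
$(d_2 x + d_1 y)/(d_1 + d_2)$, and the visibility factor can then be
written as

$$V(y) = \frac{1}{A_s} \int_S P\!\left( \frac{d_2 x + d_1 y}{d_1 + d_2} \right) dx$$

To show that this expression is a convolution, let us transform
V(y) by extending the integration over the entire plane:

$$V(y) = \frac{1}{A_s} \int_{\mathbb{R}^2} S(x) \, P\!\left( \frac{\alpha x + y}{1 + \alpha} \right) dx = \frac{1}{\alpha^2 A_s} \left( S_\alpha * P_\alpha \right)(y) \qquad (2)$$

where $\alpha = d_2 / d_1$, $A_s$ is the area of the source,
$S_\alpha(x) = S(-x/\alpha)$, $P_\alpha(x) = P(x/(1+\alpha))$,
and * denotes a convolution operation. Therefore, in this particular
geometric configuration, the visibility factor reduces to the
convolution of the scaled characteristic functions of the source and blocker.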
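
As a sanity check of the geometry behind this result, the following sketch evaluates the visible fraction of the source by brute force in a one-dimensional cross-section of the parallel configuration. It assumes that d1 and d2 denote the source-to-blocker and blocker-to-receiver plane distances; the function name, the interval half-widths and the sample count are illustrative choices and are not taken from the paper.

```python
def visible_fraction(y, half_src=0.5, half_blk=1.0, d1=1.0, d2=1.0, n=1000):
    """Fraction of a 1D source [-half_src, half_src] visible from receiver point y.

    The occluder is the interval [-half_blk, half_blk] in the blocker plane
    (both intervals and all default values are illustrative assumptions).
    """
    visible = 0
    for i in range(n):
        # Midpoint sample on the source.
        x = -half_src + (i + 0.5) * (2.0 * half_src / n)
        # Intersection of the segment x -> y with the blocker plane.
        b = (d2 * x + d1 * y) / (d1 + d2)
        # The sample is visible when the intersection misses the occluder.
        if abs(b) > half_blk:
            visible += 1
    return visible / n
```

With the default values, y = 0 lies in the umbra, a distant point such as y = 10 is fully lit, and y = 2 sees half of the source, which is the smooth penumbra transition that the convolution produces.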

Note that this particular form of the visibility term implies that it 
is continuous, therefore implicitly creating soft shadow variations 
on the receiver. An example of convolution between source and
blocker images is presented in Figure 1. For a diffuse surface, we
can express the radiosity function B on the receiver by introducing
the diffuse reflectance $\rho(y)$ and using Eq. (1):

$$B(y) = \rho(y) \, E \, F_s(y) \, V(y) \qquad (3)$$



Therefore, a possible algorithm for displaying soft shadows is to 
compute a shadow map using the convolution formula, and use it as 
a texture to modulate the illumination function $\rho(y) E F_s(y)$ across
the receiver. 



3.2 Computation of soft shadows in general configura- 
tions 

We will see in Section 5 that Equation (3) can be used in an effi- 
cient algorithm to create illumination textures. But its value is of 
course severely limited by the assumption that all objects are pla- 
nar and parallel. In real applications, not only can light sources and 
receivers be placed at arbitrary orientations, but occluders can also 
in general occupy a complex volume in 3D. 

In a general source/blocker/receiver configuration, it is not possi- 
ble to derive a convolution formula similar to Eq.(3). Nevertheless, 
we propose to approximate the resulting shadow effect by using the 
convolution method for a virtual geometry that obeys the preceding 
requirements, and transform the associated result to fit the actual 
geometry of the scene. This involves the following operations: 

a) choosing a direction $\mathcal{D}$ and a set of three planes containing
respectively a virtual source $S_v$, a virtual blocker $B_v$ and a
virtual receiver $R_v$, all planar and orthogonal to $\mathcal{D}$;



b) computing the illumination function on the virtual receiver us- 
ing the convolution formula (3); 

c) projecting the result back on the actual receiver. 

Clearly, depending on the actual geometry of the scene, such an 
approximation may produce some artifacts. The different sources 
of error and the way to control them are addressed in Section 6. We 
now discuss each of these steps in more detail. 



4 Extensions to the basic principle 

In this section we show that our convolution method can be adapted 
with little modification to even more general lighting conditions. 
Non-uniform light sources, sources with complex 3D shapes, and 
complex receiver shapes can all be simulated. Practical examples 
for each of the three cases are presented in Section 7. In particular, 
we show that groups of objects can be used to model sources or 
receivers in a single shadow map calculation. 



3.2.1 Choice of the virtual geometry 

The choice of the direction of projection $\mathcal{D}$ is the first issue to be
addressed. Obviously, the nature and importance of the approximation
will largely depend on this choice. We will discuss this
question in more detail in Section 5.1. For now, we observe that it
seems natural to have $\mathcal{D}$ be some average of the directions actually
involved in the transfer of energy. Therefore we suggest as a
possibility to choose $\mathcal{D}$ to be the mean direction of all possible rays
between the source and the receiver (Figure 2(a)).
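
One way to approximate such a mean direction is to average the unit vectors joining sample points on the source to sample points on the receiver. The sketch below does this with NumPy; the function name and the use of a finite set of sample points (for instance polygon vertices) are assumptions made for illustration, and the paper does not prescribe this discretization.

```python
import numpy as np

def mean_direction(source_pts, receiver_pts):
    """Unit vector averaging all source-sample -> receiver-sample directions."""
    src = np.asarray(source_pts, dtype=float)    # shape (ns, 3)
    rcv = np.asarray(receiver_pts, dtype=float)  # shape (nr, 3)
    # All pairwise rays from source samples to receiver samples: (ns, nr, 3).
    rays = rcv[None, :, :] - src[:, None, :]
    # Normalize each ray (assumes no source sample coincides with a receiver sample).
    rays /= np.linalg.norm(rays, axis=2, keepdims=True)
    total = rays.reshape(-1, 3).sum(axis=0)
    return total / np.linalg.norm(total)
```

For a source square centered above a parallel, centered receiver square, the symmetric contributions cancel and the result points straight from the source plane to the receiver plane.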

Once $\mathcal{D}$ has been chosen, it defines the orientation of the three
virtual planes. Let us denote by Z an axis parallel to $\mathcal{D}$. Then, we
choose altitude values $z_s$, $z_b$ and $z_r$ for the three virtual planes
(Figure 2(b)). The choice of these values is discussed in Section 5.2.

Each component is now projected onto its virtual plane. The
virtual source is obtained from the source by orthographic projection
along $\mathcal{D}$. Thus, viewed from the blocker, this virtual source has
nearly the same aspect as the original one (see Figure 2(c)).

The virtual blocker is the projection of the original blocker on the
virtual blocker plane. The projection used is a perspective
projection, with the eye set at the center of the source (see Figure 2(c)). Using
the same projection onto the virtual receiver plane, we obtain the
virtual receiver from the original one. Viewed from the center of
the source, the original and virtual blockers (resp. receivers) are
thus identical.
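
The two projections can be sketched as follows, assuming that each virtual plane is described by the direction of projection d and an altitude z measured along d (so the plane is the set of points q with q·d = z). The function names are hypothetical, and the code only illustrates the geometry described above.

```python
import numpy as np

def project_orthographic(p, d, z):
    """Project point p along direction d onto the plane {q : q.d = z}."""
    d = d / np.linalg.norm(d)
    return p - (np.dot(p, d) - z) * d

def project_perspective(p, eye, d, z):
    """Project point p from the eye point onto the plane {q : q.d = z}.

    Assumes the ray eye -> p is not parallel to the plane.
    """
    d = d / np.linalg.norm(d)
    ray = p - eye
    # Parameter t such that eye + t * ray lies on the plane.
    t = (z - np.dot(eye, d)) / np.dot(ray, d)
    return eye + t * ray
```

The first function would be applied to the source, the second to the blocker and the receiver with the eye placed at the center of the source.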





Figure 2: Construction of a virtual source, blocker and receiver for
a general shadow configuration. (a) Choosing a preferred direction.
(b) Choosing altitudes for the virtual planes. (c) Projecting the
original elements to obtain their virtual counterparts.



4.1 Dealing with non-uniform radiosity over light 
sources 

When the source is not uniform, but still planar, we can modify the 
derivation of Section 3, replacing the uniform exitance term E by a 
non-uniform exitance E(x):

$$H(y) = \int_S E(x) \, \frac{\cos\theta \, \cos\theta'}{\pi \, d(x,y)^2} \, v(x,y) \, dx$$

Replacing S by S × E in the derivation of Equation 3, the "visibility"
term V(y) turns into:

This is another convolution, which can easily be calculated using
our method by equipping the source with a texture containing its
relative exitance function before rendering it to the offscreen buffer.
We essentially include the variations of the source's emission in the 
visibility integral, with the double advantage that (a) the calculation 
of the unoccluded illumination is not modified, and (b) the poten- 
tial correlation between visibility and source emission is properly 
accounted for. 
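
In image terms, this amounts to weighting the source image by the relative exitance before convolving. The sketch below illustrates the idea with NumPy on small arrays; the function name is hypothetical, and normalizing by the sum of the weighted source image is an assumption chosen so that an unoccluded point receives a value of 1.

```python
import numpy as np

def weighted_shadow_map(source_mask, rel_exitance, blocker_img):
    """Circular convolution of an exitance-weighted source image with a blocker image."""
    # Source image modulated by its relative exitance (the "textured" source).
    weighted = np.asarray(source_mask, float) * np.asarray(rel_exitance, float)
    spectrum = np.fft.fft2(weighted) * np.fft.fft2(np.asarray(blocker_img, float))
    conv = np.real(np.fft.ifft2(spectrum))
    # Assumed normalization: total weighted source "area".
    return conv / weighted.sum()
```

When the brightest part of the source is the part hidden by the blocker, the weighted map is darker than the one obtained with a uniform source, which is the correlation effect mentioned above.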

Note that the same approach could be used to adapt our method to
translucent occluders by replacing the binary term P(x) by a more
general one varying in the range [0,1]. However, translucent objects
generally operate in a nonlinear manner on light propagation
(because of refraction and diffusion), which prevents us from deriving
a proper convolution-based formula.

4.2 Complex light source shapes 

Three-dimensional light sources do not require much more compu- 
tation than a planar light source: All we need is the projection of the 
source onto its virtual plane. Apart from computing its projection
in the offscreen buffer, a volumetric source requires attention
to the choice of direction $\mathcal{D}$, which will not follow the
same criteria as those described for a planar light source.



3.2.2 Back to the actual geometry 

Once computed for the virtual receiver, the visibility term V(y) is 
projected back to the actual receiver where it is multiplied by the 
direct illumination factor from the source, $\rho E F_s(y)$. In practice,
the convolution image is set as a shadow texture and modulated 
by direct illumination values for uniformly sampled points on the 
receiver. 



4.3 Simultaneously shadowing a group of objects 

Just as for a polygonal receiver, a shadow map can be assigned to 
an entire cluster. The shadow map is then shared among objects, 
while each surface receives its own texture coordinates. 

Note that this does not address self shadowing in the cluster, 
which may be achieved by applying our method using one part of 
the receiving cluster as a potential occluder for the remaining part. 



5 Practical computation of shadow textures 

We now describe our implementation, in which we use the convo- 
lution operation to create soft shadow textures. We have integrated 
this algorithm in our research testbed for hierarchical radiosity, but 
it should be noted that it can be used in other environments as well. 
We make use of two features of the radiosity system: first, we use 
the form factor calculation routines to evaluate the unoccluded il- 
lumination term. Second, we use the hierarchy of object clusters 
to select potential occluders between a given pair of source and re- 
ceiver. Other techniques could be used to compute the illumination, 
such as Heckbert's combination of hardware point sources [13]. As 
for the cluster hierarchy, common structures such as hierarchies of 
bounding volumes are easily constructed and provide the necessary 
hierarchy. 

Let us assume for now that we have selected a light source, a 
receiver object and a cluster of occluders. Such configurations can 
be automatically selected by ranking their potential for the creation 
of soft shadows, as a function of their absolute and relative sizes 
and distances. For each of the issues discussed below, we suggest a 
suitable strategy or solution. 

5.1 Choice of the direction of projection 

As suggested earlier, the direction of projection $\mathcal{D}$ must adequately
represent the set of all possible rays between the source and the
receiver. If the source and receiver are planar surfaces, we first
determine a useful receiver by clipping the receiver against the source
plane and a useful source by clipping the source against the receiver
plane. This operation prevents $\mathcal{D}$ from being parallel to the source,
which would produce an empty source image, or parallel to the
receiver, which would make the computed texture projection fail. We
then restrict the set of rays between the useful source and receiver
to rays that actually encounter the blocker (that is, the extent of the
cluster's bounding box). The direction of projection is chosen as a
median value of this set.

The choice of the direction $\mathcal{D}$ does not affect the placement of
umbra and penumbra regions, but for some special cases, such as
the subdivision of the receiver, we shall see that it can be important
not to choose $\mathcal{D}$ for each receiver independently.



ages of the virtual source and blocker are obtained by rendering the 
objects that constitute the real source and blocker, using the projec- 
tions previously described, in an offscreen buffer of desired size. 

Source and blocker frustums are scaled to achieve the required $\alpha$
and $1 + \alpha$ scaling factors with respect to the intrinsic dimensions of
the objects, as dictated by Equation (2).

Polygons are rendered in white over a black background, with 
no z-buffering. Note that a non-uniform source is rendered with a 
texture modulating its color to follow its relative exitance function. 
The blocker image is inverted while reading pixel values from the 
offscreen buffer. In addition, the negative sign in the convolution 
Equation (2) means that the source image must be reflected across 
its horizontal and vertical axes. This is achieved by scaling the ge- 
ometric model of the source with a negative factor when rendering. 
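
Expressed on arrays rather than at render time, these two image operations can be sketched as below. This assumes both renderings are available as 2D arrays with values in [0, 1] (white polygons on black); the function name is hypothetical, and flipping the array reflects the source through the image center, which stands in for the negative scale factor used when rendering.

```python
import numpy as np

def prepare_images(source_render, blocker_render):
    """Return the reflected source image and the inverted blocker image."""
    # Reflect the source across both image axes (accounts for the negative sign).
    source = np.asarray(source_render, dtype=float)[::-1, ::-1]
    # Invert the blocker rendering: occluded pixels become 0, free pixels 1.
    blocker = 1.0 - np.asarray(blocker_render, dtype=float)
    return source, blocker
```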

Selection of the blocker frustum 

The blocker frustum actually used is computed as the intersection 
of the receiver and blocker frustums viewed from the source (Fig- 
ure 3). This avoids the computation of large unshadowed areas 
when the blocker is too small, and the computation of too large 
a texture when the receiver is too small or when the source is very 
close to the blocker. The near and far clipping planes are set so as
to capture the blocking polygons in this region and to avoid
projecting irrelevant geometry.
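
If both frustums are described by their rectangular extents on a common image plane seen from the source (an assumption made here for illustration, since the text does not specify the representation), the useful frustum reduces to a rectangle intersection. The function name is hypothetical.

```python
def intersect_rect(a, b):
    """Intersect two rectangles given as (xmin, ymin, xmax, ymax).

    Returns None when the rectangles do not overlap, meaning the blocker
    cannot shadow the receiver as seen from the source.
    """
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)
```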




Figure 3: Construction of the blocker projection frustum (labeled regions: blocker frustum, useful frustum, receiver frustum).



5.2 Choice of the virtual planes 

The choice of the altitudes of the virtual planes directly affects the 
size of the resulting penumbra regions in the computed texture. Al- 
titudes for the virtual planes of the receiver, source, and blocker 
could simply be chosen as the centers of the altitude ranges of 
the three elements (See Figure 2.(b)), but more accuracy can be 
achieved on the resulting shadow texture by choosing the altitudes 
so as to obtain penumbra regions of median sizes in the range of 
those actually produced. 

We will explain in Section 6.2 how to compute the size of the
actual and computed penumbra. Using this calculation we can
compute an "optimal" virtual blocker altitude which creates the desired
median size: denoting by $z_s$ and $z_r$ the virtual source and receiver
altitudes, and by $z_b^{min}$ and $z_b^{max}$ the extremal altitudes of blocking
objects, the optimal altitude for the virtual blocker is

$$z_b^{opt} = \frac{D_1 \, z_b^{min} + D_2 \, z_b^{max}}{D_1 + D_2} \quad \text{where } D_1 = z_s - z_b^{max} \text{ and } D_2 = z_s - z_b^{min}$$
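
The following sketch shows one way such an altitude can be obtained. It assumes that altitudes decrease from source to receiver, that the penumbra width is proportional to the ratio (z_b - z_r) / (z_s - z_b), and that the "median" size is the midpoint between the two extremal penumbra sizes. Under these assumptions the altitude is a weighted average of the extremal blocker altitudes; function names are hypothetical.

```python
def penumbra_ratio(z_b, z_s, z_r):
    """Blocker-to-receiver distance over source-to-blocker distance."""
    return (z_b - z_r) / (z_s - z_b)

def optimal_blocker_altitude(z_s, z_b_min, z_b_max):
    """Altitude whose penumbra ratio is the mean of the two extremal ratios."""
    d1 = z_s - z_b_max  # distance from the source plane to the highest blocker point
    d2 = z_s - z_b_min  # distance from the source plane to the lowest blocker point
    return (d1 * z_b_min + d2 * z_b_max) / (d1 + d2)
```

Note that the result does not depend on the receiver altitude, and that it lies closer to the source than the simple center of the blocker altitude range.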

5.3 Sampling the virtual source and blocker character- 
istic functions 

In order to perform the convolution, we need two images of the 
scaled characteristic functions following Equation (3). These im- 



Computing the convolution 

Once computed, the source and blocker images can be convolved 
using the following well known property: 

$$(f * g)(y) = \mathcal{F}^{-1}\big( \mathcal{F}(f) \cdot \mathcal{F}(g) \big)(y)$$

where $\mathcal{F}(f)$ denotes the Fourier transform of function $f$. Since
we are dealing with 2D images, we perform a two dimensional FFT 
on each image, multiply them and finally transform the result by 
a normalized inverse FFT. The result is a sampled version of the 
visibility function V(y). We use the standard FFT library supplied 
by SGI on our systems. 
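
A minimal sketch of this step using NumPy's FFT routines (not the SGI library mentioned above) is given below. The function names are hypothetical, and dividing by the sampled source area is an assumption made so that the result lies in [0, 1]. The product of the two transforms yields a circular convolution, so the images are implicitly treated as periodic.

```python
import numpy as np

def fft_convolve2d(source_img, blocker_img):
    """Circular 2D convolution computed through the convolution theorem."""
    spectrum = np.fft.fft2(source_img) * np.fft.fft2(blocker_img)
    # np.fft.ifft2 applies the 1/N normalization of the inverse transform.
    return np.real(np.fft.ifft2(spectrum))

def shadow_map(source_img, blocker_img):
    """Sampled visibility: convolution normalized by the sampled source area."""
    return fft_convolve2d(source_img, blocker_img) / np.sum(source_img)
```

For example, with a 4 × 4 pixel source in the corner of a 64 × 64 image and a 16 × 16 pixel occluder, the map is 0 well inside the occluder footprint, 1 far from it, and takes intermediate values along its border.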

5.4 Security zone for proper convolution 

When performing the convolution of two images with a Fourier 
transform, we implicitly assume the images to be periodic. This 
is obviously true for the source image by construction (because the 
source is strictly contained in the image), but it is not always the 
case for the blocker image, because of the clipping operation by the 
receiver frustum. As a result, the sum of the image space sizes $s_1$
and $s_2$ of the windows actually occupied by the source and blocker
sampled functions must not exceed the total sampling window size
$s$. To ensure this property, we further scale both frustums by a
factor of $(s_1 + s_2)/s$, which is the secure scale factor that allows the greatest
relative resolution for the effective texture (see Figure 4).
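
The scale factor can be sketched as follows, assuming that enlarging a frustum by a factor k divides the image-space footprint of its content by k. The function name and the pixel values in the test are illustrative only.

```python
def secure_scale(s1, s2, s):
    """Frustum scale factor k such that s1 / k + s2 / k equals the window size s."""
    return (s1 + s2) / s
```

With footprints of 300 and 340 pixels in a 512-pixel window, the factor is 1.25, after which the footprints shrink to 240 and 272 pixels and exactly fill the window.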




Figure 4: An example convolution between a source image (a) and
a blocker image (b), for which the receiver clips the blocker
frustum. The red square on the blocker image indicates the interesting
area for the given receiver. The source and blocker frustums have
been equally scaled until $s_1 + s_2 < s$ (note that enlarging the
frustum reduces the effective sizes $s_1$ and $s_2$). In the resulting image
(c), pixels outside the blue region are spoiled by FFT wrap-around
effects and are not used in the shadow map.



5.5 Resolution issues 

The resolution of the shadow map should be chosen carefully. On 
the one hand, it determines the resolution of all auxiliary images for 
the convolution operation, and the cost of the convolution itself (See 
the results in Section 7). On the other hand, it should be appropriate 
for the size of the receiver in object space and the variations of 
penumbra across its surface: Whereas a nearly hard shadow due to 
a small spotlight demands a large texture to be rendered accurately, 
a very smooth shadow mainly based on penumbra does not require 
very dense sampling. Such situations can be easily characterized 
since they simply depend on $\alpha$. Section 7 shows practical examples
of scenes with the texture sizes used. 

Due to the different scaling factors, and especially for small
values of $\alpha$, the source sample can actually have a different area
(ratio between the numbers of white and black pixels) than that of
an ideally sampled image. This area plays an important role in 
the texture as it determines the maximum value of the convolu- 
tion. Thus, a wrong area value on the source produces inappro- 
priately normalized shadow maps, with annoying discontinuity ar- 
tifacts. This problem can be addressed using antialiased rendering 
for the source, so that the area of the source sample has a more ac- 
curate value. Unfortunately, depending on the OpenGL implemen- 
tation, the value of a pixel in an antialiased polygon is not always 
exactly the area of the pixel fractions covered by the polygon [19]. 

Blocker aliasing does not affect the resulting textures in the same 
way, but antialiasing is also required for the blocker characteristic 
function to avoid inconsistencies or discontinuities in the penumbra 
regions, caused by very long and fine blocking objects. The images 
of the mobile in Figure 12 would be particularly affected without 
antialiasing. 

5.6 Using the shadow texture 

As previously stated, the computed convolution image is used as 
a shadow texture on the receiver, modulated by direct illumination 
values. 

Since we need to represent the variations of the unoccluded illumination
across the receiver, we create (for display purposes only)
a regular mesh of vertices $P_i$. Each vertex of this mesh is equipped
with a color value computed as $\rho(P_i)\,E\,F_s(P_i)$.



When rendering, the receiver is displayed as a textured triangle 
strip set. For each vertex, texture coordinates are computed by pro- 
jecting the corresponding receiver point onto the virtual receiver 
plane. These coordinates can either be pre-computed and stored 
with the mesh, or directly computed by OpenGL by adequately setting
the texture projection matrix [19]. However, since texture coordinates
are provided only for those vertices, the mesh size must
account for both the illumination gradient on the receiver, and the
strength of the deformation due to the projection from the virtual
receiver. In practice, typical mesh sizes range from 2 × 2 for small
polygons (for example, the cubes in Figure 14) to 20 × 20 for walls.

The unoccluded point-to-area form factors are computed using
the exact point-to-polygon formula [1]. When the receiver is a cluster
of objects, each surface receives its own display mesh of adequate
size.
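
The per-vertex display value can be sketched in Python as follows. The form factor routine uses the classical contour-integral expression for the unoccluded form factor from a point to a planar polygon, which we assume is the kind of formula meant by [1]. All function names are hypothetical, and the sketch assumes a unit-length normal and a polygon lying entirely above the tangent plane of the point.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def point_to_polygon_form_factor(point, normal, vertices):
    """Unoccluded form factor from a surface point to a planar polygon.

    Assumes `normal` has unit length and that the polygon lies entirely
    above the tangent plane at `point` (no horizon clipping is done).
    """
    total = 0.0
    count = len(vertices)
    for i in range(count):
        r0 = _sub(vertices[i], point)
        r1 = _sub(vertices[(i + 1) % count], point)
        edge_normal = _cross(r0, r1)
        length = math.sqrt(_dot(edge_normal, edge_normal))
        if length == 0.0:
            continue  # an edge seen end-on contributes nothing
        cos_angle = _dot(r0, r1) / math.sqrt(_dot(r0, r0) * _dot(r1, r1))
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        total += angle * _dot(normal, edge_normal) / length
    # abs() makes the result independent of the vertex winding order
    return abs(total) / (2.0 * math.pi)


def vertex_color(reflectance, exitance, form_factor):
    """Unoccluded display value rho * E * F_s at one mesh vertex."""
    return reflectance * exitance * form_factor


# A 2 x 2 square source, one unit above a receiver vertex at the origin.
square = [(-1, -1, 1), (1, -1, 1), (1, 1, 1), (-1, 1, 1)]
f_s = point_to_polygon_form_factor((0, 0, 0), (0, 0, 1), square)
color = vertex_color(0.5, 2.0, f_s)
```

At display time this value would be multiplied by the shadow texture sample, which is what the texture modulation performs.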

6 Error-driven shadow computation 

Although, by construction, our method places umbra and penum- 
bra regions in the right place, the different approximations do not 
lead to exact illumination values. In this section, we first examine 
the different sources of error and the way to quantify this error. We 
then review possible refinement techniques to produce more accu- 
rate results, and finally present a hierarchical algorithm to compute 
the shadow texture with a given precision. 

6.1 Qualitative discussion of error sources 
Virtual blocker 

To characterize the error due to the use of the virtual blocker, let 
us consider the case where the receiver is parallel to the source. 
Since the light source is not a single point, the umbra of the (pla- 
nar) virtual blocker will differ from that of the actual blocker in the 
following two respects: 

• When projecting the actual blocker to the virtual one, all the
triple-edge discontinuity curves [15] of the discontinuity mesh
collapse into edge-vertex events. This modification of the
discontinuity mesh's internal topology affects the illumination
gradient in penumbra regions. The effect becomes more
noticeable as the source gets larger.

• Since all parts of the virtual blocker share the same altitude, 
all computed penumbra regions will have the same sharp- 
ness. This is the most obvious visual effect of using a virtual 
blocker. 

Projection on the receiver 

Let us consider the blocker to be planar and parallel to the source, 
and study the difference between the umbra directly computed on 
the actual receiver and that projected back from the virtual one. 

Although the projection of the shadow texture back onto the ac- 
tual receiver tends to produce a general shadow region of the right 
size, it also conserves the size ratio of penumbra and umbra regions. 
This ratio on an actual receiver strongly depends on the distance 
between the receiving point and the source. Thus the computed 
shadow on a large receiver may show umbra where there actually is 
penumbra, or the reverse. 

6.2 Measuring the error 

A simple way to estimate the error would be to derive a bound on
the difference between the exact and computed shadow functions,
depending on the virtual geometry parameters. Although such a
method based on standard $L_1$, $L_2$ or $L_\infty$ distances would produce
conservative bounds on the global error, it would not allow a reliable
characterization of the shadow artifacts, because of its inherent
non-locality and its poor agreement with human perception
criteria [24]. We instead consider a form of perceptual error, and
study its variation in terms of the virtual geometry parameters.

The most noticeable artifact due to the use of the virtual ge- 
ometry is the production of penumbra regions of inadequate size. 
We propose to estimate the ratio between the computed and exact 
penumbra sizes. The range of variation of this ratio for all points in 
the configuration will serve as an error measure for the use of the 
virtual geometry. 



Figure 5: Sizes of computed and exact penumbra regions for a
given source/blocker/receiver configuration and virtual plane
altitudes $z_s$, $z_b$ and $z_r$.

Our error estimate is derived below with a simple reasoning in
two dimensions. As shown in Figure 5, the penumbra regions on
the virtual receiver (follow red lines), due to a polygon side of the
virtual blocker, at a given altitude $z_b$, have size

$$P_v = \alpha L_s \quad \text{where} \quad \alpha = \frac{z_b - z_r}{z_s - z_b}$$

After projection (blue lines) onto the actual receiver, at altitude $z$,
the computed penumbra will have approximate size:

$$P_c(z) = \alpha L_s \, \frac{1}{\cos\gamma} \, \frac{z_s - z}{z_s - z_r}$$

The penumbra due to the actual blocking polygon at altitude $z'$
can also be approximated by:

$$P_e(z', z) = L_s \, \frac{1}{\cos\gamma} \, \frac{z' - z}{z_s - z'}$$
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Figure 6: Altitude ranges for the blocker and receiver 



Let us assume that the set of blocking objects lies between altitudes
$z_b^{\min}$ and $z_b^{\max}$, and that the receiver is bounded by $z_r^{\min}$ and
$z_r^{\max}$ (Figure 6). We assume without loss of generality that

$$z_r^{\max} < z_b^{\min}$$

In this case, the approximation error $\Delta_e(z', z)$ reaches its maximum
value for $z = z_r^{\max}$ and $z' = z_b^{\min}$, and its minimum value for $z = z_r^{\min}$
and $z' = z_b^{\max}$. The difference between these two extremal values is
the maximal error amplitude for the current configuration:

$$E_{\max}(\text{Receiver}, \text{Blocker}) = \Delta_e(z_b^{\min}, z_r^{\max}) - \Delta_e(z_b^{\max}, z_r^{\min}) \qquad (4)$$



As expected, this error estimate decreases to zero when the 
blocker and receiver become planar and parallel. 
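
The estimate can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the relative error is taken to be the ratio of computed to exact penumbra size from the two-dimensional reasoning above, with $\alpha = (z_b - z_r)/(z_s - z_b)$ and both penumbra sizes obtained by similar triangles, so that the $1/\cos\gamma$ factor cancels. Function and parameter names are illustrative.

```python
def relative_error(z_blocker, z_receiver, z_s, z_b, z_r):
    """Ratio of computed to exact penumbra size for one blocker/receiver pair.

    z_s, z_b, z_r: altitudes of the source and of the virtual blocker and
    virtual receiver planes. z_blocker, z_receiver: actual altitudes z' and z.
    """
    alpha = (z_b - z_r) / (z_s - z_b)                      # virtual penumbra scale
    computed = alpha * (z_s - z_receiver) / (z_s - z_r)    # P_c / L_s
    exact = (z_blocker - z_receiver) / (z_s - z_blocker)   # P_e / L_s
    return computed / exact


def max_error(z_s, z_b, z_r, blocker_range, receiver_range):
    """Amplitude of the relative error over the given altitude ranges."""
    zb_min, zb_max = blocker_range
    zr_min, zr_max = receiver_range
    if not (zr_min <= zr_max < zb_min <= zb_max < z_s):
        raise ValueError("expected receiver below blocker below source")
    return (relative_error(zb_min, zr_max, z_s, z_b, z_r)
            - relative_error(zb_max, zr_min, z_s, z_b, z_r))


# Planar, parallel blocker and receiver lying in their virtual planes.
flat = max_error(10.0, 4.0, 0.0, (4.0, 4.0), (0.0, 0.0))
# Same virtual planes, but blocker and receiver now have vertical extent.
thick = max_error(10.0, 4.0, 0.0, (3.0, 5.0), (0.0, 1.0))
```

With these inputs `flat` vanishes, in line with the remark above, while `thick` is clearly positive.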

6.3 Reducing the error 

Now that we can estimate the amount of approximation incurred, 
we consider the options available to reduce it. We first list all po- 
tential parameters of the problem, and focus on the combination 
of several shadow maps corresponding to sub-clusters of a given 
occluder. 



6.3.1 Parameters influencing shadow quality 

Source subdivision Subdividing the source would help reduce
the discontinuity mesh topological error described in Section 6.1,
and would also make the assumption of low correlation between kernel
and visibility more accurate. Since these two kinds of error do not
significantly affect the visual aspect of the shadow, except for very
large light sources, we currently ignore this option.

Image resolution Increasing the shadow texture resolution or
improving the choice of the direction of projection helps reduce the
errors specific to each, but it does not lead to arbitrarily accurate
shadow textures in terms of penumbra accuracy.

Therefore, it is generally more efficient and practical to subdi- 
vide either the receiver or the blocker as sketched in Figure 7. 



Thus, the relative error between the true and computed penumbra
will be

$$\Delta_e(z', z) = \frac{P_c}{P_e} = \frac{(z_s - z')(z_s - z)}{z' - z} \cdot \frac{z_b - z_r}{(z_s - z_r)(z_s - z_b)}$$



Receiver subdivision Subdividing the receiver into two or more
sub-receivers accounts for the different ratios of characteristic sizes
of umbra and penumbra for different receiving regions (Figure 7.a).

In this case, a shadow texture is computed separately for each
sub-receiver, using the convolution method. This increases the
computation time, due to the larger number of convolutions to compute,
but takes into account different $\alpha$ configurations for the
different parts of the subdivided receiver.

Figure 7: (a) A receiver subdivided into two receivers, with their
associated virtual receivers. (b) A blocking cluster subdivided into
two blocking sub-clusters, with their associated virtual blockers.

Figure 8: Artifacts produced along a subdivision boundary on the receiver.
Panel titles: Automatic choice of direction; Source direction; Receiver direction.

Particular attention should be paid to the choice of direction
D for each sub-receiver, as illustrated in Figure 8. As each texture
is projected back to its own receiver, boundary artifacts may
appear: in this figure, a receiver patch is subdivided into four
receivers (left). The next three images (from left to right) show a
shadow detail in the boundary region of two sub-receivers, with
three different choices for D:

automatic: D is chosen independently for the four receivers as
described in Section 5.1. Note the discontinuity along the
boundary, due to non-matching shadow textures.

source: D is the source's normal direction. The four receivers thus
have the same direction D, but penumbrae still have different
sharpness. The discontinuity is barely noticeable in the
penumbra.

receiver: D is the receiver's normal. The four receivers have the
same virtual plane, and shadows fit together perfectly.

Thus we see that a proper choice of D (the same direction for all
receivers, in this case) can reduce or eliminate most of the artifacts.

Blocker subdivision When a large set of blocking objects 
(grouped in a cluster) projects shadows on the receiver, it produces 
penumbra regions of different sizes. In such a configuration, subdi- 
viding the receiver would not suffice. We can expect a more accu- 
rate result by considering separately subsets of objects in the cluster 
(Figure 7.b), computing their associated shadow texture using our 
convolution method with suitable virtual blockers. 



Such a subdivision requires a procedure for combining shadow 
maps created from the subclusters into a shadow map correspond- 
ing to the entire blocking cluster. This issue is addressed in the next 
section. 

6.3.2 Combination of shadow maps from different subclusters 

When two blockers are treated separately, all information on their 
spatial correlation is lost. Thus, exact recombination of the two 
shadow maps requires the knowledge of the correlation function 
of the two blockers. The simple experimental case described in 
Figure 9 illustrates the impossibility of retrieving an exact shadow 
map value knowing only that of the subclusters. 
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Figure 9: Extremal situations for blocker-to-blocker correlation. 
For any receiver point y on the bold line, the visibility values $V_1(y)$
and $V_2(y)$ for each separate blocker are exactly $\frac{1}{2}S$. But the actual
visibility $V(y)$ is $\frac{1}{2}S$ in the left-hand case, and 0 in the right-hand
one. These extremal values are in fact those given by Equation (5).

We propose an approximation method to achieve such a combination
for the case of a subdivision into two subclusters. This combination
method generalizes readily to more sub-clusters. Let us
call $V_1(y)$ and $V_2(y)$ the shadow maps computed for each sub-cluster
separately, and $V(y)$ the shadow texture associated with the parent
cluster. Recalling that the value of the shadow texture is the area
of the portion of the source that is visible from a receiver point, we
can consider the worst and best correlation cases between blockers
1 and 2 and write:

$$V_2(y) - (S - V_1(y)) \le V(y) \le \min(V_1(y), V_2(y))$$

and thus:

$$\max(0, V_1(y) + V_2(y) - S) \le V(y) \le \min(V_1(y), V_2(y)) \qquad (5)$$

(S is the area of the source). Thus, we can use the following median
value for the combined texture, as an approximation of $V(y)$:

$$V_{1,2}(y) = \frac{1}{2}\left(\min(V_1(y), V_2(y)) + \max(0, V_1(y) + V_2(y) - S)\right) \qquad (6)$$

Figure 10: Hierarchical combination of shadow maps using
a variable number of clusters. In each image, the final shadow
map is assembled from as many partial maps as there are selected
clusters. Cluster selection is performed using the error estimation
described in the text. The reference solution was obtained with ray casting.
Panels: a test scene; error threshold 1.0, 1 cluster, 160 ms; error threshold 0.645,
2 clusters, 340 ms; error threshold 0.452, 10 clusters, 1055 ms; error threshold
0.387, 34 clusters, 3.38 s.

The maximum error incurred by this approximation arises in the
two configurations depicted in Figure 9, where it reaches the value
$\frac{1}{4}S$. It should be noted that such configurations occur rarely,
only when two polygons sharing an edge (as viewed from the source) are
treated as separate blockers. This is the case in the last image of
Figure 10, where the 210 polygons forming the cubes have been used
as separate blockers. The visible effect is a slightly faster variation
of the penumbra around the shadow of the cubes in comparison to
the reference solution.
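
The median combination of Equation (6) is straightforward to apply texel by texel. The following Python sketch assumes shadow maps stored as lists of rows holding visible source area; the function names are illustrative.

```python
def combine_visibility(v1, v2, source_area):
    """Median of the two bounds of Equation (5), as in Equation (6)."""
    upper = min(v1, v2)                       # hidden parts of the source nested
    lower = max(0.0, v1 + v2 - source_area)   # hidden parts of the source disjoint
    return 0.5 * (upper + lower)


def combine_maps(map1, map2, source_area):
    """Apply the combination texel by texel to two equally sized maps."""
    return [[combine_visibility(a, b, source_area) for a, b in zip(row1, row2)]
            for row1, row2 in zip(map1, map2)]


S = 1.0
# The extremal situation of Figure 9: each blocker alone hides half the source.
half_each = combine_visibility(0.5 * S, 0.5 * S, S)
combined = combine_maps([[1.0, 0.5], [0.25, 0.0]],
                        [[0.5, 0.5], [1.0, 0.75]], S)
```

In the extremal case the result lies halfway between the two possible true values, 0 and S/2, which is where the maximum error quoted above comes from.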

6.4 Shadow approximation algorithm 

We can now organize the preceding elements into a complete hierar- 
chical refinement algorithm, controlled by explicit error estimation. 

Refinement criterion 

For a given configuration, Equation (4) gives an estimation of the 
error due to the use of the convolution method. By comparing the 
error estimates for the two separate cases of blocker and receiver 
subdivision, we can decide which choice leads to the smallest error 
on the final texture: 

$$E(R_1, \ldots, R_n, B) = \max(E_{\max}(R_1, B), \ldots, E_{\max}(R_n, B))$$

$$E(R, B_1, \ldots, B_p) = C_{Err} + \min(E_{\max}(R, B_1), \ldots, E_{\max}(R, B_p))$$

$C_{Err}$ is a combination error term, that is, a bound on the error due to
the correlation of sub-cluster shadow maps. It turns out to be negligible
in practice. An important property is that for any subcluster b
of B, and any sub-receiver r of R, we have

$$E(r, b) \le E(R, B)$$

Equality occurs only when b = B and r = R. The refinement algorithm
is summarized in Figure 11, where the procedure
ConvolutionTexture(S, B, R) computes the shadow texture using the given
source, blocker and receiver. CombineTextures performs the combination
of Equation (6) and PasteTexture makes a single texture from the
textures of the four sub-receivers. When subdividing the cluster of
occluders, we save and re-use the Fourier transform of the source
image, and adapt the scale of the occluder images to account for
the fixed size of the source, thereby saving the cost of one FFT per
occluder used. Figure 10 shows the results of the application of this
algorithm for different error thresholds. It clearly shows that for
a single cluster, all penumbra regions have the same extent, while
the range of possible penumbra sizes increases with the number of
clusters. The reference solution was computed using ray casting on
the true geometry, with 1024 rays per pixel.

Figure 10 (remaining panels): error threshold 0.0, 210 clusters, 21 s; reference solution.

Texture ComputeTexture(S, B, R)
    if E_max(S, R, B) < ε_max
        return ConvolutionTexture(S, B, R)
    else
        if E(R_1, ..., R_n, B) < E(R, B_1, ..., B_p)
            T_1 = ComputeTexture(S, B, R_1)
            ...
            T_n = ComputeTexture(S, B, R_n)
            return PasteTexture(T_1, ..., T_n)
        else
            T_1 = ComputeTexture(S, B_1, R)
            ...
            T_p = ComputeTexture(S, B_p, R)
            return CombineTextures(T_1, ..., T_p)

Figure 11: Algorithm for shadow texture computation
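
The control flow of Figure 11 can be sketched as a runnable Python function. Everything supplied through `ops` (the error estimate, the two splitting routines and the three texture operations) is a hypothetical hook standing in for the corresponding operation named in the text. The guard for clusters that cannot be split further is an addition for safe termination, and `c_err` defaults to zero because the combination term is described as negligible.

```python
from types import SimpleNamespace


def compute_texture(source, blocker, receiver, ops, eps_max, c_err=0.0):
    """Recursive refinement following the structure of Figure 11."""
    if ops.e_max(source, receiver, blocker) < eps_max:
        return ops.convolution_texture(source, blocker, receiver)
    sub_receivers = ops.split_receiver(receiver)   # empty list when atomic
    sub_blockers = ops.split_blocker(blocker)      # empty list when atomic
    if not sub_receivers and not sub_blockers:
        # Nothing left to refine: accept the residual error.
        return ops.convolution_texture(source, blocker, receiver)
    inf = float("inf")
    err_receiver = max((ops.e_max(source, r, blocker) for r in sub_receivers),
                       default=inf)
    # C_Err + min(...), following the criterion as printed in the text.
    err_blocker = c_err + min((ops.e_max(source, receiver, b) for b in sub_blockers),
                              default=inf)
    if err_receiver < err_blocker:
        return ops.paste_textures(
            [compute_texture(source, blocker, r, ops, eps_max, c_err)
             for r in sub_receivers])
    return ops.combine_textures(
        [compute_texture(source, b, receiver, ops, eps_max, c_err)
         for b in sub_blockers])


def halve(interval, min_extent=0.5):
    """Toy splitter: clusters are altitude intervals, atomic below min_extent."""
    lo, hi = interval
    if hi - lo <= min_extent:
        return []
    mid = 0.5 * (lo + hi)
    return [(lo, mid), (mid, hi)]


# Toy hooks: the "texture" is just the number of convolutions performed, and
# the error is a made-up stand-in (sum of the vertical extents).
toy_ops = SimpleNamespace(
    e_max=lambda s, r, b: (b[1] - b[0]) + (r[1] - r[0]),
    split_receiver=halve,
    split_blocker=halve,
    convolution_texture=lambda s, b, r: 1,
    paste_textures=sum,
    combine_textures=sum,
)
n_convolutions = compute_texture(None, (0.0, 4.0), (0.0, 0.0), toy_ops, eps_max=1.5)
```

In this toy run the planar receiver cannot be split, so the blocker interval is halved twice and four convolutions are performed.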



Discussion of convergence 

Each leaf of the cluster hierarchy contains one or more surfaces,
arbitrarily oriented. Thus we cannot refine the blocking clusters into
arbitrarily small subclusters. The maximum extent $\delta_z$ of atomic
objects in the blocking cluster determines the minimum possible error
of the computed texture, using criterion (4). The associated texture
artifact will be localized in the shadow region produced by the
object, and can be attributed to the quality of the object model.

Conversely, the possibility of refining the receiver is only limited 
by the allowed computation time for the shadow map. 

Our subdivision criterion does not take into account the size and 
orientation of the source, which means that it does not capture the 
totality of the error, and that the images do not fully converge to the 
true images. 



7 Results 

We present in this section a number of images illustrating the results 
of our algorithm. As a general rule, shadowed polygons (walls, 
objects) are entirely illuminated using our convolution method, 
whereas blockers themselves are illuminated using standard radios- 
ity computation, without any extra visibility treatment. Computa- 
tion times are given for the shadow texture treatment only, since 
any illumination method could be used for other objects, including 
hardware lighting. 



7.1 Breakdown of computation time 

The scenes used for the following images contain between 212 and
45,000 polygons, mainly concentrated in the blockers. Offscreen
rendering times range from less than 1 ms to about 160 ms on SGI
Onyx2/iR and O2 computers. The cost of one FFT calculation is
proportional to n log n, where n is the number of pixels in the images.
Therefore this cost is very sensitive to the resolution of the
shadow maps. On the Onyx2 we observe the following computation
times (in milliseconds, for the calculation of a single representative
map such as that of the floor). The corresponding images can
be seen in Figure 10 (Cubes, 212 polygons), Figure 14 (Pyramid,
4340 polygons) and Figure 12 (Mobile, 45000 polygons).

For comparison, the same operations on the O2 take similar
amounts of time:

In each case we indicate the total time to compute a shadow map,
and in the FFT column the time for a single FFT operation. Three
FFTs are needed to obtain a shadow map, and other image-based
operations take another 140% of the time of a single FFT. We see
that the cost of FFTs dominates for textures of 512² pixels and
higher. Note that these are fairly large texture sizes, used only for
large polygons (walls, for instance) in the images. Smaller textures
suffice for most objects.
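
The count of three FFTs per map follows from the convolution theorem: one forward transform for the source image, one for the blocker image, and one inverse transform of their product. By the figures quoted above, a map therefore costs roughly 3 + 1.4 = 4.4 single-FFT times. The Python sketch below shows the three transforms on a one-dimensional row, using a small recursive radix-2 FFT written for illustration only; a real implementation would use an optimized two-dimensional FFT library.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is left unnormalized (the caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circular_convolution(source, blocker):
    """Convolve two equally long rows using three FFTs."""
    n = len(source)
    spectrum_source = fft(source)        # FFT 1: source image
    spectrum_blocker = fft(blocker)      # FFT 2: blocker image
    product = [a * b for a, b in zip(spectrum_source, spectrum_blocker)]
    return [value.real / n for value in fft(product, inverse=True)]  # FFT 3


source_row = [0.5, 0.5, 0, 0, 0, 0, 0, 0]   # two-pixel source, area normalized to 1
blocker_row = [1, 1, 1, 0, 0, 1, 1, 1]      # 1 = light passes, 0 = blocked
shadow_row = circular_convolution(source_row, blocker_row)
```

The result is a circular convolution: the value at index 0 also picks up the blocker sample at the opposite end of the row, which is the FFT wrap-around effect that makes border pixels unusable.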



7.2 Hierarchical combination of shadow maps 

Figure 12 demonstrates the use of the hierarchical shadow map 
combination of Section 6.3.2. The scene contains 45,000 polygons, 
mostly in the complex objects attached to the mobile. The resolu- 
tion of the three shadow maps is 256 x 256 for the images shown, 
although it should be noted that half this size produces almost in- 
distinguishable results. Therefore we indicate computation times 
for both 128 and 256 resolution. 

7.3 Casting shadows on several surfaces at once 

Figure 14 shows a cluster of cubes for which a single 128 × 128
shadow map has been calculated. Each polygon of the cluster is
equipped with a coarse display mesh (2 × 2 to 3 × 3) containing
the unoccluded form factor to the source. 512 × 512 shadow maps are
used for the floor and walls. The same occluding cluster containing
the plant has been used for all maps. The plant itself is made out of
4,340 polygons.

7.4 Complex light sources 

Figure 13.(a) illustrates the lighting effects that can be simulated
when a light source with a complicated shape casts shadows. The 
"98" shape is three-dimensional text, made of 360 triangles. A 
256 x 256 texture has been computed for each of the three walls 
using the "hole" cluster as an occluder. 

Figure 13.(b) shows how a single convolution can create the ef- 
fect of several small sources. Note that the stepping effect is nor- 
mal here, because there really are four distinct small sources in the 
scene. A single image of the cluster of light sources is used (shown 
in Figure 13.(c)). 

Figure 13.(d) demonstrates shadows cast by elongated light
sources: with neon tubes, penumbra regions are smooth in the
direction parallel to the tubes, but exhibit a stepping effect in the
direction perpendicular to the axes of the tubes.

8 Discussion 

Our results demonstrate the advantage of a convolution method 
over an explicit sampling method, in that penumbra regions are al- 
ways continuous. Note that we are still performing a discrete sam- 
pling of the source via the offscreen rendering step, but we are able 
to treat many samples in a single operation (and the samples are 
antialiased). Our method can be considered to encompass Heck- 
bert's [13] since we can simulate an extended light source with non- 
uniform exitance distribution, where only a fixed number of sample 
points have non-zero energy. In our images, we have chosen to use
light sources of small or moderate size, so that shadows contain
identifiable penumbra/umbra regions. Naturally our method works
for light sources of any size, whereas sampling methods would have
to use very large numbers of samples.

Even when no subdivision is performed (i.e. with a single oc- 
cluding cluster grouping all potential occluders), the method pro- 
duces visually pleasing images without stepping effects. Note that 
all occluders are always taken into account exactly once with their 
complete shape, no matter how many clusters are used in the hier- 
archical subdivision. In this respect, our method provides a better 
solution to multi-resolution shadow calculation than the simple vol- 
ume approximation of [24]. As more hierarchical levels are used to 
recombine shadow maps, better shadows are obtained, and the com- 
putation time becomes dominated by the cost of the FFT. 

Interestingly, we observe that our method in essence trades
graphics performance for raw compute power, since it renders a
single image of the source and occluder but requires a number of
FFT calculations. This paradigm shift appears consistent with the
evolution of computer technology. We also note that DSP chips are
commonly found on multimedia computers, and significantly accelerate
FFT calculations. In fact, tests run on DSP-equipped O2
computers show that the FFT cost for large images is comparable
to that of the Onyx2.

Figure 12: Hierarchical combination of shadow maps: results obtained with different error thresholds, requiring more and more shadow
textures to be computed. Two timings are given for each image (3 textures in each), for texture resolutions of 128² and 256² pixels.
Panels: 1 cluster, 660 ms / 980 ms; 7 clusters, 1.1 s / 2.75 s; 21 clusters, 2.2 s / 7.4 s.

Figure 13: Shadows cast by complex light sources: (a) 3D source (b) set of small sources (c) light source images (d) elongated sources.

Finally, the algorithm is highly parallelisable. Not only can we 
compute FFTs in parallel, but also the recombination operations for 
blocker or receiver subdivision. 

As for any approximation method, there exist extremal cases 
where our algorithm does not work properly. One example of this 
could be obtained using a large blocking polygon that lies in a plane 
containing the direction D. In such a case, the blocker image is 
nearly empty and hardly any shadow is produced. Subdividing the
source into two regions that see a particular side of the polygon 
and adding the associated shadow maps together would correct this 
problem. 

Large objects touching the receiver also produce bad configurations
unless they can be subdivided, because the ideal $\alpha$ values for
such objects range from 0 (for parts of the polygons that touch the
receiver) to larger values that produce smoother shadows. For a
table lying on the floor, for example, although the shadows are pro- 
duced in the right place, they appear to be too smooth where the 
table legs touch the floor. In such a configuration, explicit sampling 
methods would produce better results [13]. 

Figure 14: A single 128 × 128 shadow map was computed for the
cluster of cubes, and used to obtain shadows on each individual
cube according to its location in space.

9 Conclusions and Future Work

We have presented a new calculation method for soft shadows from
extended light sources. The method is based on the expression
of visibility functions using convolution operations. It allows the
simulation of soft shadows from complex light sources or clusters, 
having complex shapes and non-uniform exitance distributions. Re- 
ceivers can be individual surfaces or object clusters, in which case 
shadows are correctly cast on all objects of the cluster. Occluders 
can be arbitrary object clusters. 

The approximations introduced by the formulation as a convo- 
lution have been discussed, and a hierarchical algorithm has been 
proposed for the combination of shadow maps from sub-clusters. A 
subdivision criterion was derived to limit the error incurred in the 
size of the penumbra regions. The algorithm is automatic and can 
be readily integrated in existing rendering systems. 

Future work includes the extension of the convolution approach 
to other illumination problems. For instance the illumination by 
the hemispherical sky dome can also be expressed as a convolu- 
tion. This was first shown by Max in a restricted case where a 
horizontal plane is used to model skylight [17]. The expression of 
the illumination kernel, in the absence of occlusion and for parallel 
source/receiver pairs, is also a convolution. 

The current hierarchical combination algorithm will not be able 
to compute an exact shadow if the clusters contain an object whose 
"vertical" extent (along the direction of interest) is too large. More 
elaborate refinement criteria should include provisions to identify 
such cases and provide alternate methods to compute associated 
shadow maps, which can then be combined in the same way with 
those obtained by convolution. 

Another important research direction is the re-use of source and 
occluder images: saving the cost of the associated FFT would sig- 
nificantly accelerate the process for large textures. Image-based 
rendering methods could perhaps be adapted to derive such images 
from a set of precomputed images. Such a derivation would have to 
take place in the Fourier domain to be really effective. 
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